Solvable model for template coexistence in protocells 
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Compartmentalization of self-replicating molecules (templates) in protocells is a necessary step 
towards the evolution of modern cells. However, coexistence between distinct template types inside 
a protocell can be achieved only if there is a selective pressure favoring protocells with a mixed 
template composition. Here we study analytically a group selection model for the coexistence 
between two template types using the diffusion approximation of population genetics. The model 
combines competition at the template and protocell levels as well as genetic drift inside protocells. At 
the steady state, we find a continuous phase transition separating the coexistence and segregation 
regimes, with the order parameter vanishing linearly with the distance to the critical point. In 
addition, we derive explicit analytical expressions for the critical steady-state probability density of 
protocell compositions. 

PACS numbers: 87.10.Ed, 89.75.Fb, 05.70.Jk 



I. INTRODUCTION 

Explaining the coexistence among selfish individuals is 
an alluring issue that pervades all disciplines concerned 
with the emergence and stability of complex structures 
evolving under the rules of natural selection. A series of 
classical quandaries can be traced back to this issue such 
as the plankton paradox in Ecology [1 , the tragedy of 
commons in Sociology j2j and the information crisis of 
prebiotic evolution [3], just to name a few. The many 
ingenious solutions proposed to tackle those dilemmas, 
e.g., nonequilibrium predator-mediated coexistence [3], 
coexistence in flows with chaotic mixing , kin selection 
[Hj , reciprocal altruism i cyclic cooperative interactions 
[8] and group selection [9], have become major research 
topics by themselves. 

Here we revisit the problem of coexistence of selfish 
individuals using the group selection framework in the 
context of prebiotic evolution [T0Ul3j . We recall that, 
in this context, coexistence is hypothesized to circum- 
vent Eigen's paradox of the origin of life [S] - no large 
genome without enzymes, and no enzymes without a 
large genome - by assuming that, initially, each short 
template coded for a small piece of a modular enzyme 
(see, e.g., [H]). The (haploid) individuals are viewed as 
self-replicating molecules or templates and the groups as 
protocells or vesicles that are themselves capable of repro- 
duction. This is then a prototypical multilevel selection 
problem |15j which has been analyzed chiefly through 
Monte Carlo simulations and numerical iteration of re- 
cursion equations. The two dynamics that govern the 
competition between templates and between protocells 
are coupled because the reproduction rate of the proto- 
cells depends on their template composition. 

The evolutionary processes we consider here are indi- 
vidual selection and group selection. In addition, since 



we assume that the population of templates inside each 
protocell is large but finite, random genetics drift plays 
an important role too, especially in the hindering of tem- 
plate coexistence. The population of protocells, however, 
is assumed infinite. We consider two distinct types of 
templates only and use the so-called diffusion approxi- 
mation to obtain the partial differential equation that 
determines the proportion of protocells with a given com- 
position of templates at a given time [16]. Analysis of 
the (singular) steady-state solution reveals a continuous 
phase transition separating the coexistence regime, in 
which a fraction of protocells exhibit a mixed composi- 
tion of templates, from the segregation regime, in which 
there is no coexistence of template types inside a pro- 
tocell. More pointedly, we show that the order param- 
eter vanishes linearly with the distance to the critical 
line, and derive an analytical expression for the critical 
steady-state solution. 



II. THE MODEL 

The population is divided into an infinite number of 
protocells each of which carrying exactly N templates. 
There are two types of templates - type 1 and type 2 - 
which differ only by their replication efficiencies: type 1 
has a selective disadvantage s > relative to type 2. We 
denote by x € [0, 1] the frequency of type 1 templates 
within a protocell (the frequency of type 2 templates is 
then I — x), and by (f>{x,t) the probability density of 
protocells with a fraction x of type 1 templates. Here 
we assume that N is sufficiently large so that x can be 
treated as a continuous variable. Hence (j) (x, t) Aa; yields 
the proportion of protocells carrying type 1 templates 
with frequency lying in the range {x,x -\- Ax). There is 
no mutation between the template types and there is no 
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migration of templates between protocells. 

Within each protocell, template reproduction follows 
the rules of the standard Wright-Fisher model [T7]. In 
particular, we assume that the individual selection coef- 
ficients are 1 — s and 1 for type 1 and type 2 templates, re- 
spectively. So, if the frequency of type 1 templates within 
a given protocell is x before reproduction, then that fre- 
quency becomes x = x {1 — s) / {1 — sx) !v x — sx {1 — x) 
after reproduction. We assume, as usual, that s is on 
the order of This deterministic process is followed 

by the random sampling of N templates: the probabil- 
ity that the given protocell carries exactly i — 0, . . . ,N 
templates of type 1 is given by the binomial 



(1) 



so that the frequency of type 1 templates after determin- 
istic selection and random sampling is x' = i/N. The way 
this procedure changes the probability density (j}{x,t) is 
derived using the diffusion approximation of population 
genetics [T7], which consists essentially on the calcula- 
tion of the first two jump moments ^(x' — x)") to obtain 
the drift (n = 1) and the diffusion {n — 2) terms of a 
Fokker-Planck equation (see eq. |4]) . 

The competition between protocells is taken into ac- 
count as follows. Denoting by c (x) the selection coeffi- 
cient of a protocell with a fraction x of type 1 templates 
we have 

(j){x,t + At) = [<j) {x, t) + c (x) (j) (x, t) At] C (2) 

where C is such that dxcf) {x,t + At) — 1, i.e, C — 
1/(1 + cAt) with 



c (x) (f) (x, t) dx. 



(3) 



Finally, taking the limit Ai — > we obtain the change in 
the fraction of protocells due to intercell selection, A(/) = 
[c(x) - c](/)(x,i) At. 

Combining the changes in cf) due to the Wright-Fisher 
process and the intercell selection results in the equation 

m 



d 

Ft' 



2dx^ 



dx 



[a(x)0] + [c(x) - c] (4) 



where b{x) — x{l — x)/N, a{x) — — sx(l — x), and c 
is given by eq. ([s]). Equation Q was first derived by 
Kimura aiming at studying the efficiency of group selec- 
tion on the evolution and maintenance of an altruistic 
character ll6_. In our notation, the altruists are the type 
1 templates, which have a selective disadvantage rela- 
tive to type 2 or nonaltruistic templates. In the altru- 
ism context, the intercell selection coefficient or group 
selection pressure must be an increasing function of x, so 
that the higher the frequency of altruists in a group, the 
greater is the selective advantage of the group. In partic- 
ular, Kimura has chosen the linear dependence c(x) cx x 



and included the effects of mutation and migration in his 
analysis [16]. It should be noted, however, that the in- 
troduction of mutation and migration, which affects only 
the drift term a(x) in eq. Q, actually greatly simplifies 
the analysis since these mixing processes guarantee the 
existence of a regular equilibrium distribution [18] . 

Here we consider the coexistence problem instead, 
which is considerably more taxing to group selection than 
the altruistic version summarized above. In fact, despite 
their handicap at the individual level, the altruistic tem- 
plates have a nonzero probability of being fixed in small 
groups solely through the effect of random drift, whereas 
this very effect is a major pressure against the coexistence 
between different templates [12] . To favor coexistence we 
choose the intercell selection coefficient 



c (x) = cx (1 — x) 



(5) 



which is maximum for well-balanced protocells at which 
X = 1/2. Here c is a parameter on the order of 1/iV which 
measures the intensity of the group selection pressure 
towards coexistence. 



III. ANALYSIS OF THE STEADY STATE 



At the steady state d<j)/dt ~ 0, eq. Q reduces to 

+ {Cx(\-x)-C\(^ = (6) 

where S = Ns and C = Nc are now parameters that can 
take on arbitrary positive values, and 



C = C x{l- x)(j}{x)dx. 



(7) 



The solution cj) has to be found in the interval [0, 1] and 
since the extremes of this interval are absorbing barriers 
it can be written in the general form 



(j) (x) = ^0 ^ (x) + AiS {1 - x) + Br] (x) 



(8) 



where Aq, Ai and B are positive constants (weights) that 
depend on the parameters S and C only. Here rj (x) is 
a regular function that satisfies the second order differ- 
ential equation ^ in the open interval (0, 1) and can 
be eventually continued in the interval [0, 1] by defining 
r] (0) = lim2;_5.o r] (x) and rj (1) = limx-i-i r] (x). In addi- 
tion, imposing the normalization rj (x) dx = 1 we have 



and 



C 



An + Ai + B = l 



BC I x{l- x)f]{x)dx. 
'o 



(9) 



(10) 



We note that eq. (|8| implies that, in the general case, 
the population is composed of three types of protocells: 
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homogeneous protocells carrying type 1 templates only, 
homogeneous protocells carrying type 2 templates only, 
and heterogeneous protocells carrying any arbitrary mix- 
ture of the two templates. The proportion of each type 
in the infinite protocell population is Aq, Ai and B, re- 
spectively. 

Next we derive two most useful relations between the 
values of 77 (x) at the extremes a; = and x = 1 and the 
weights that appear in eq. ([s]). Integrating eq. ^ over 
the interval [— e, e] and neglecting terms of order of e and 
higher yield Aq — Bi] (0) /C or, equivalently. 



Ao 



r,(0) 



C x{l — x)r] (x) dx 



(11) 



where we have used ^ [x{l — x)ri{x)] = x{l — x)-^ri{x) + 
(1 - 2x)t]{x), 77(e) =%(0) + o(e) and T]{-e) = 0. Here 
o(e) contains all terms of order e or higher. Similarly, 
integration of eq. (|6| over the interval [1 — e, 1 -I- e] yields 



Ai = 



C /q x{1 — x)ri (x) dx 



(12) 



We note that B = 1 — Aq — Ai can be inserted in eq. 



( 10 1 so that the regular solution rj can be obtained au- 



tonomously (see eq. (15 1 for a similar operation). 



The numerical procedure to find the regular solution 
T] (x) of eq. ^ is greatly facilitated if we define the aux- 
iliary function y (x) as 



r){x)^R exp {-Sx/2) y (x) 



(13) 



where R — 1/ dx exp {~Sx/2) y (x) is introduced to 
guarantee the normalization of 77, leaving us free to im- 
pose the normalization dx y{x) = 1 on the auxiliary 
function y. It is easy to verify from eq. ([6]) that in the 
open interval (0,1) we have 



dx 



2 (1 - x) y] + Tx{l~x)y 



Cy 



(14) 



where T — C — S'^ /A and C is obtained by integration of 



eq. (14 1 in the interval (0, 1) as 



C = T x[l-x)y{x)dx-y[Q)-y{l) (15) 

JO 

where ?/(0) = limj;_>o 2/(2;) ^nd 2/(1) = Imix^i y{x). Since 
the differential equation ( 14 1 is symmetrical around x — 



1/2, we assume that the solution y [x) is symmetrical too, 
so that y(0) = y{l). 



IV. NUMERICAL SOLUTION 

At this stage our mathematical problem is ready for a 
numerical approach. We are left with a Sturm-Liouville 
problem, eq. (14), without boundary conditions which 



only [12]. Of course, this requirement is satisfied pro- 
vided that C is the eigenvalue of the Sturm-Liouville 
problem, which depends solely on the value of F. In 
practice we determine C by propagating the solution us- 
ing the Runge-Kutta algorithm from a; = 0tox = l/2, 
from a; = 1 to a; = 1/2, and then requiring that the 
two solutions join smoothly at x = 1/2 (see page 746 of 
|20|). We use an arbitrary value for y(0) = y{\) to be- 
gin the Runge-Kutta iterations, and the first derivatives 
2/'(0) = -y'(l) = (1 C/2) 7/(0) as given by eq. (|l4|. 
The arbitrariness of the initial condition has no effect on 
the resulting eigenvalue C and it is mitigated by impos- 
ing the normalization condition on y{x). In fact, since 
the coefficients Aq and Ai given by eqs. (11) and ( [T2| , 
as well as B = 1 — Aq — Ai^ depend on the ratio of terms 
involving 77 (x) (and so y {x)) our arbitrary choice of 7/(0) 
is inconsequential. Finally, we note that eq. (15) does 



not provide any additional information about y or C - 
it is derived from eq. ( 14 ) - and so it is not used in our 
numerical procedure. 

The dependence of the eigenvalue C on F exhibited in 
the upper panel of fig. jl] reveals the existence of a critical 
value Fc below which C = 0. This critical point separates 
two distinct steady-state regimes regarding the possibil- 
ity of coexistence of the two types of templates within the 
same protocell, and so C can be seen as the order param- 
eter of our problem. We note that C/F (the coefficient 
of the linear term in the expansion of C in powers of F) 
varies very slowly with increasing F (see lower panel of 
fig.[T]), being confined to the interval [0.205, 0.25]. In fact, 
since in the limit of large F we have F w C ^ 1 the group 
selection pressure favoring coexistence becomes the dom- 
inant force leading to an ideal coexistence scenario, i.e., 
(j)^S{x- 1/2), for which C/F « 1/4 (see eq. 

Before we carry on the characterization of the two 
steady-state regimes separated by Fc, let us show how 
the value of this critical parameter can be calculated. 



THE TRANSITION LINE 



Setting C — and defining (x) 



can be solved by requiring the regularity of y (x) in [0, 1] 



x{l-x)y{x), 
eq. ( 14 ) reduces to the harmonic oscillator equation 

d'^ijj/dx'^ + FcV" = 0. The solution ijj — KcCOS (tI^^x^ + 

Kg sin ^Fc^^a:^ must vanish at a; = and a; = 1, other- 
wise y [x) would not be normalizable. This implies that 
Kc — 0, and Fc — iP' or, equivalently, 

Cc-^2 + ^. (16) 

Hence, at the critical point we have y^ (a;) = 
Kg sin (vra;) / [a; (1 — x)\ where the normalization factor is 
= 1//q da;sin(7ra;)/[a;(l-x)] « 0.270 . Finally, 
returning to the original regular probability density we 
write 

7,c {x) = i?c«« exp {-Sxl2) !^^^!^ (17) 

a; (1 — a;) 
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FIG. 1: Eigenvalue C as function of T = C - S^/A. For F < 
Fc — TT^ we find C = 0, which characterizes the segregation 
regime. The dashed line in the upper panel is C = a (F — Fc) 
with a given by eq. ( |23| ). The lower panel shows the ratio 
C /r in a magnified scale of F, where the dashed line is the 
fitting C/r = 0.25 - 0.47/F^/^ 



where i?c is the normalization constant that appear in 
eq. (13 1 evaluated at F = Tc- 



As a consistency check we will show now that C = at 
C ~ Cc- In fact, eq. (17) yields ric (0) — RcKgir, ijc (1) ~ 
RcKsT^ejq){—S/2), and 



X {1 — x) rjc (x) dx 



RcKsTT 



[1 + exp (-S'/2)] . (18) 



Inserting these results in eqs. (11) and (|12| yields 



° 1 + exp {-S/2) 



and 



Since + A\ 
3^ = and C 



exp i-S/2) 
1 + exp (-5/2) ■ 



(19) 



(20) 



: 1 it follows from eqs. ^ and ([To]) that 
0. 



VI. CHARACTERIZATION OF THE 
STEADY-STATE REGIMES 



The first regime is associated to the parameter region 
C < Cc, and is characterized by C = 0. According to eq. 
([t]), its protocell distribution (f) is a. sum of Dirac deltas 
centered at x = and x = 1, i.e., i? = in eq. Since 
variation of C does not favor any of the template types, 
the weights of the Dirac deltas Aq and Ai in this regime 
must equal their values aX C — Cc given by eqs. ( 19 ) and 
(20), respectively. Clearly, the two types of templates 



cannot coexist within the same protocell and so we refer 
to this regime as the segregation regime. 

The second steady-state regime is characterized by 
C > and so admits the existence of protocells in which 
the two types of templates coexist. The quantity of in- 
terest here is the weight B (see eq. ([8|) which gives the 
fraction of protocells carrying the two template types, 
regardless of their frequencies inside the protocells. In 
fig. [2] we present the dependence of B on the distance 
to the critical point F — Fc = C — Cc for different val- 
ues of the (rescaled) individual selection coefficient S. 
Note that although C depends on F only, the weights 
Aq, Ai and B associated to the different protocell types 
exhibit an explicit dependence on S as well, which comes 
through the function rj (see eq. (13)). In particular, for 
a fixed group selection pressure, increase of the selective 
advantage of type 2 templates reduces the fraction B of 
protocells that exhibit coexistence between templates, as 
expected. More pointedly, we will show that, close to the 
critical point, B decreases with 1/S for increasing S. 

In fig. [3] we show the regular part of the density of 
probability of protocells rj {x) for different group selec- 
tion intensities. The protocells are unbalanced in this 
figure because of the choice 5* = 1 which confers a se- 
lective advantage to type 2 templates in the competition 
at the individual level that goes on inside the protocells. 
However, as C increases this unbalance diminishes and 
eventually the population is completely dominated by 
well-balanced protocells so that 'q{x) — >■ 5 {x — 1/2). 

Next we derive explicit analytical expressions for the 
behavior of the order parameter C as well as for the coex- 
istence probability B near the critical point Fc. Multiply- 
ing eq. (14) by exp (—Ax) and integrating in the interval 



(0, 1) yield 



C/Q = (F + A^) / cxp(-Ax)x(l -a;)y(x)da; 



-2/(0) [l + exp(-A)] 



(21) 



with Q = 1/ j}dx exp {—Xx) y (x). This equation re- 
duces to eq. (nsl) when A = whereas for A = 5/2 it 
coincides with the relation that is obtained by combining 
eqs. ([9|, ([lO|, ([12]) and ([Ts]) and taking into account 
that F = C - 574. 
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FIG. 2: Fraction of protocells carrying the two template types 
as function of the distance to the critical point C— Cc ~ F — Fc 
for (top to bottom) 5" = 0, 5, 10 and 20. 




FIG. 3: Regular normalized solution rj [x) for 5=1 and 
C — C'c ~ 10.12, 20 and 100 as indicated in the figure. 



Taking X = in and recalling that Fc = tt^ we obtain 

X (I ~ x) sin (ttx) y (x) dx , , 

C = (F - Fc) ^ — ^ — '-^^ (22) 

sin (ttx) y (x) dx 

where we have used the fact that y (x) is symmetric about 
X = 1/2. For F close to Fc we can replace y by yc and 
write C « a (F — Fc) where 



2 / ' ""^""i rfx' 
x{l~x) 



0.205. 



(23) 



Now it is easy to obtain the behavior of the coexistence 
probability B near the critical poinL In fac t, o nce the 
behavior of C is known, use of eqs. ( 10 1 and ( 18 ) allows 



us to write B ^ (3 (T — Fc) where 

, ^ sin (ttx) , 

/? = T^T^ / exp {~Sx 2) / \ dx. 

7r[l + exp(-S'/2)J 7o x [1 - x) 



For S* = we have f3 = a/ 2% Kg 
5 > 1 we find /3 w 2a/S. 



(24) 

0.121 whereas for 



VII. CONCLUSION 

Group selection and, more generally, structured pop- 
ulation arguments aiming at explaining altruistic be- 
havior and eusociality in nature have been a source of 
controversy since they were first proposed in the 1960s 
[211 [22] (see [23] and accompanying refutations for a re- 
cent clash). However, use of group selection ideas to 
explain the coexistence of selfish individuals is a much 
less loaded issue, at least within the prebiotic evolution 
context, since there seems to exist a consensus that the 
compartmentalization of templates was a necessary evo- 
lutionary step towards the modern cell [53J [2S] . Regard- 
less of the relevance and controversies surrounding the 
role of group selection in nature, the challenging mathe- 
matical models used to describe the resulting multilevel 
selection problem have been viewed as an attraction on 
their own [13 [2M25] . Here we follow this tradition and 
offer a numerical and analytical solution to a nontrivial 
group selection model. 

We build on the work of Kimura [16] and consider a 
group selection pressure towards the coexistence of two 
types of templates with distinct replication rates. We 
find an explicit expression for the critical (i.e., minimum) 
intensity of group selection needed to balance the seg- 
regation effects of genetic drift and individual selection, 
thus guaranteeing the existence of protocells carrying the 
two template types (see eq. (16)). In addition, we derive 



analytical expressions for the steady-state distribution at 
the critical point and show that the transition between 
the regimes of coexistence and segregation is continu- 
ous with the order parameter vanishing linearly with the 
distance to the critical point. The model is nontrivial 
because it preserves the main effect of finite populations 
- genetic drift - which is revealed when the values of 
the rescaled parameters C = Nc and 5* = Ns decrease 
towards zero. Since Cc — > tt^ in this case, template co- 
existence is ruled out for small protocells due to the seg- 
regating effect of genetic drift. 

From a mathematical perspective, our work departs 
from previous population genetic studies using the dif- 
fusion approximation due to the absence of the mix- 
ing and regularizing processes of mutation and migra- 
tion |I7j . Inclusion of these processes would guarantee 
the existence of a regular equilibrium distribution for the 
Fokker-Planck-like equation Q. Use of prescription (|8|, 
however, allows us to single out the singularities at the 
absorbing barriers a; = and x = 1, whereas the relations 



(111 and (12) provide a link between the regular part of 
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the equilibrium distribution and the weights of the sin- 
gular parts. This solution strategy can be applied to a 
variety of problems characterized by the superposition of 
absorbing and extended steady states. 
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